LF has been designed and successfully used as a meta-logical framework to represent and reason
about object logics. Here we design a representation of the Isabelle logical framework in LF using 
the recently introduced module system for LF. The major novelty of our approach is that we can 
naturally represent the advanced Isabelle features of type classes and locales. 

Our representation of type classes relies on a feature so far lacking in the LF module system: 
morphism variables and abstraction over them. While conservative over the present system in terms 
of expressivity, this feature is needed for a representation of type classes that preserves the modular 
structure. Therefore, we also design the necessary extension of the LF module system. 

1 Introduction 

Both Isabelle and LF were developed at roughly the same time to provide formal proof theoretic
frameworks in which object logics can be defined and studied. Both use the Curry-Howard
correspondence to represent the proofs of the object logic as terms of the meta-logic.

Isabelle [13, 14] is based on intuitionistic higher-order logic [2] with shallow polymorphism and
was designed as a generic LCF-style interactive theorem prover. LF [?] is the corner of the
λ-cube [?] that extends simple type theory with dependent function types and is inspired by the
judgments-as-types methodology [10]. We will work with the Twelf implementation of LF [?].

It is straightforward to represent Isabelle's underlying logic as an object logic of LF (see,
e.g., [?]). However, Isabelle provides a number of advanced features that go beyond the base logic
and that cannot be easily represented in other systems. These include in particular a module
system [?] and a structured proof language [?].

Recently, we gave a module system for LF in [18]. We wanted to choose primitive notions that are so
simple that they admit a completely formal semantics. While such formal semantics are commonplace 
for type theories - in the form of inference systems - they quickly get very complex for module systems 
on top of type theories. At the same time these primitives should be expressive enough to admit natural 
representations of modular design patterns. Here by "natural", we mean that we are willing to accept 
lossy (in the sense of being non-invertible) encodings of modular specifications as long as their modular 
structure of sharing and reuse is preserved. 

In this paper we give such a representation of the Isabelle module system in the LF module
system. The main idea of the encoding is that all modules of Isabelle (theories, locales, type
classes) are represented as LF signatures, and that all relations between Isabelle modules
(imports, sublocales, interpretations, subclasses, instantiations) are represented as LF
signature morphisms.

Thus, our contribution is two-fold. Firstly, we validate the design of the LF module system by
showing that it provides just the right primitives needed to represent the Isabelle module system.
Actually, before arriving at that conclusion we identify one feature that we have to add to the LF
module system: abstraction over morphisms. And secondly, we show how LF can be used as a concise
intermediate language in order to translate Isabelle libraries to other systems. Moreover, for
researchers familiar with LF but not with Isabelle, this paper can complement the Isabelle
documentation with an LF-based perspective on the foundations of Isabelle. However, an
implementation of our representation must remain future work.

In Sect. 2 we will repeat the basics of Isabelle and LF to make the paper self-contained. In
Sect. 3 we extend the LF module system with abstraction over morphisms. Then we give our
representation in Sect. 4.

2 Preliminaries 
2.1 Isabelle 

Isabelle is a mature and widely used system, which has led to a rich ontology of Isabelle declarations. 
We will only consider the core and module system declarations in this paper. And even among those, we 
will restrict attention to a proper subset of Isabelle's power.

For the purposes of this paper, we make some minor adjustments for simplicity and consider
Isabelle's language to be generated by the grammar in Fig. 1. Here | and * denote alternative and
repetition, and we use special fonts for nonterminals and keywords.



theory         ::= theory name imports name* begin thycont end
thycont        ::= (locale | sublocale | interpretation | class | instantiation | thysymbol)*
locale         ::= locale name = (name : instance)* for locsymbol* + locsymbol*
sublocale      ::= sublocale name < instance proof*
interpretation ::= interpretation instance proof*
instance       ::= name where namedinst*
class          ::= class name = name* + locsymbol*
instantiation  ::= instantiation type :: (name*) name begin locsymbol* proof* end
thysymbol      ::= consts con | defs def | axioms ax | lemma lem
                 | typedecl typedecl | types types
locsymbol      ::= fixes con | defines def | assumes ax | lemma lem
con            ::= name :: type
def            ::= name : name var* = term
ax             ::= name : Prop
lem            ::= name : Prop proof
typedecl       ::= (var*) name
types          ::= (var*) name = type
namedinst      ::= name = term
type           ::= var :: name | name | (type, ..., type) name | type ⇒ type | prop
term           ::= var | name | name term* | λ(var :: type)*. term
Prop           ::= Prop ⟹ Prop | ⋀(var :: type)*. Prop | term = term
proof          ::= a primitive Pure inference as defined in [?, p. 7]
name, var      ::= identifier

Figure 1: Simplified Isabelle Grammar

A theory is a named group of declarations. Theories may use imports to import other theories,
which yields a simple module system. Within theories, locale and type class declarations provide
further sources of modularity. Theories, locales, and type classes may be related using a number
of declarations as described below.

The core declarations occurring in theories (thysymbol) and locales (locsymbol) are quite similar.
consts and fixes declare typed constants c :: τ. defs and defines declare definitions for a
constant f taking n arguments as f_def : f x₁ ... xₙ = t where t is a term in the variables xᵢ.
axioms and assumes declare named axioms a asserting a proposition φ as a : φ. lemma declares a
named lemma l asserting φ with proof P as l : φ P.

Furthermore, in theories, typedecl declares n-ary type operators t as (α₁, ..., αₙ) t, and
similarly types declares an abbreviation t for a type τ in the variables αᵢ as
(α₁, ..., αₙ) t = τ. Locales do not contain type declarations. However, they may declare new types
indirectly by declaring constants whose types have free type variables, e.g., ∘ :: α ⇒ α ⇒ α in a
locale for groups. References to these types are made indirectly using type inference, e.g., if
there is another constant e :: β, then an axiom x ∘ e = x enforces that α and β refer to the same
type.

The constant declarations within a locale serve as parameters that can be instantiated. The
intuition is that a locale instance loc where σ takes the locale with name loc and translates it
into a new context (which can be a theory or another locale). Here σ is a list of parameter
instantiations (namedinst) of the form c = t instantiating the parameter c of loc with the term t
in that new context.

Locale instances are used in two places. Firstly, locale declarations may contain a list of
instances used to inherit from other locales. In a locale declaration

locale loc = ins₁ : loc₁ where σ₁ ... insₙ : locₙ where σₙ for Σ + Σ'

the new locale loc inherits via n named instances: Instance insᵢ inherits from the locale locᵢ via
the list of parameter instantiations σᵢ. Σ and Σ' declare the core declarations of the locale.

The set of constant declarations of the locale is defined as follows: (i) The declarations in Σ
logically precede the instances, i.e., are available in the σᵢ and Σ'. (ii) A copy of the
declarations of each locᵢ translated by σᵢ is available in each σⱼ for j > i and in Σ'; the names
insᵢ serve as qualifiers to resolve name clashes if two declarations of the same name are present.
(iii) The declarations in Σ' are only available in Σ'.

The σᵢ do not have to instantiate all parameters of locᵢ - parameters that are not instantiated
become parameters of loc. Thus, the parameters of loc consist of the not-instantiated parameters
of the locᵢ and the constants declared in Σ and Σ'.

Secondly, a declaration sublocale loc' < loc where σ π postulates a translation from loc to loc',
which maps the parameters of loc according to σ. The axioms and definitions of loc induce proof
obligations over loc' that must be discharged by giving a list π of proofs. If all proof
obligations are discharged, all theorems about loc can be translated to yield theorems about loc',
and Isabelle does that automatically. A locale interpretation is very similar to a sublocale. The
difference is that all loc expressions are translated into the current theory rather than into a
second locale.

The concepts of locales and type classes have recently been aligned [5], and in particular type
classes are also locales. But the syntax still reflects their different use cases. A type class is
a locale inheriting only from other type classes and only without parameter instantiations. Thus,
the locale syntax can be simplified to class C = C₁ ... Cₙ + Σ where C inherits from the Cᵢ. All
declarations in Σ may refer to at most one type variable, which can be assumed to be of the form
α :: C. The intuition is that Σ provides operations c₁, ..., cₙ that are polymorphic in the
parametric type α and axioms about them.

An instance of a type class is a tuple (τ, c₁_def, ..., cₙ_def) where τ is a type and cᵢ_def is a
definition for cᵢ at the type τ. Because every cᵢ can only have one definition per type, the
definitions can be inferred from the context and be dropped from the notation; then a type class
can be seen as a unary predicate on types τ. Type class instantiations are of the form

instantiation t :: (C₁, ..., Cₙ) C begin Σ π end

where t is an n-ary type operator, i.e., a type with n free type variables αᵢ. Σ contains the
definitions for the operations of C at the type (α₁, ..., αₙ) t in terms of the operations of the
instances αᵢ :: Cᵢ. This creates proof obligations for the axioms of C, and we assume that all the
needed proofs are provided as a list π. The semantics is that if τᵢ :: Cᵢ are type class
instances, then so is (τ₁, ..., τₙ) t :: C. Note that this includes base types for n = 0.

Example 1. The following sketches two type classes for orderings and semilattices with universe α,
ordering ≤, and infimum ⊓ (where we omit inferable types and write · for empty lists):

class order = · + ≤ :: α ⇒ α ⇒ prop
class semlat = order + ⊓ :: α ⇒ α ⇒ α
locale lat = inf : semlat where ·
             sup : semlat where ≤ = λx λy. y inf.≤ x for · + ·

Here the omitted axioms in semlat would enforce that the type variables α in the types of ≤ and ⊓
refer to the same type. Then a locale for lattices is obtained by using two named instances of a
semilattice where the second one flips the ordering. The parameters of lat are inf.≤ (the
ordering), inf.⊓ (the infimum), and sup.⊓ (the supremum), but not sup.≤, which is instantiated.
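
How the parameters of a locale arise from its named instances can be illustrated with a small Python sketch. It is only an illustration under stated assumptions and not Isabelle's implementation: the function `locale_parameters`, the flat parameter list given for semlat, and the ASCII names `le` and `meet` (standing for the ordering and the infimum) are hypothetical.

```python
def locale_parameters(instances, own_constants, locales):
    """Collect the parameters of a new locale.

    instances:     list of (qualifier, locale name, set of instantiated parameters)
    own_constants: constants declared directly in the new locale
    locales:       hypothetical table mapping a locale name to its parameter names
    """
    params = []
    for qualifier, loc_name, instantiated in instances:
        for p in locales[loc_name]:
            # Parameters that are not instantiated become (qualified) parameters.
            if p not in instantiated:
                params.append(f"{qualifier}.{p}")
    return params + list(own_constants)


# Assumption: semlat is flattened to the two parameters it inherits and declares.
locales = {"semlat": ["le", "meet"]}

# lat uses two named instances; the second one instantiates the ordering.
lat_params = locale_parameters(
    [("inf", "semlat", set()), ("sup", "semlat", {"le"})],
    own_constants=[],
    locales=locales,
)
print(lat_params)
```

The instantiated parameter of the second instance is dropped, while all other parameters are kept under their instance qualifier.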

Finally the inner syntax for terms, types, propositions, and proof terms - also called the Pure
language - is given by an intuitionistic higher-order logic with shallow polymorphism. Types are
formed from type variables α :: C for type classes C, base types, type operator applications,
function types, and the base type prop of propositions. Type class instances of the form τ :: C
are formed from type variables α :: C and type operator applications (τ₁, ..., τₙ) t for a
corresponding instantiation t :: (C₁, ..., Cₙ) C and type class instances τᵢ :: Cᵢ. We will assume
every type to be a type class instance by using the special type class Type of all types.

Terms are formed from variables, typed constants, application, and lambda abstraction. Constants 
may be polymorphic in the sense that their types may contain free type variables. When a polymorphic 
constant is used, Isabelle automatically infers the type class instances for which the constant is used. 
Propositions are formed from implication, universal quantification over any type, and equality on any 
type. 

We always assume that all types are fully reconstructed. Similarly, we cover neither the Isar
proof language nor tactic invocations. Instead, we simply assume primitive inferences from Pure's
natural deduction calculus [?], i.e., using introduction/elimination rules for conjunction and
implication, reflexivity and substitution rules for equality, as well as axioms for
αβη-conversion and extensionality.

2.2 LF 

The non-modular declarations in an LF signature are kinded type family symbols a : K and typed
constants c : A. Both may carry definitions, e.g., c : A = t introduces c as an abbreviation for t.
The objects of Twelf are kinds K, kinded type families A : K, and typed terms t : A. type is the
kind of types, and A → type is the kind of type families indexed by terms of type A. We use Twelf
notation for binding and application: The type Π_{x:A} B(x) of dependent functions taking x : A to
an element of B(x) is written {x : A} B x, and the function term λ_{x:A} t(x) taking x : A to t(x)
is written [x : A] t x. We write A → B instead of {x : A} B if x does not occur in B, and we will
also omit the types of bound variables if they can be inferred.

The Twelf module system [19] is based on the notions of signatures and signature morphisms [?].
Given two signatures sig S = {Σ} and sig T = {Σ'}, a signature morphism from S to T is a
type/kind-preserving map μ of Σ-symbols to Σ'-expressions. Thus, μ maps every constant c : A of Σ
to a term μ(c) : μ(A) and every type family symbol a : K to a type family μ(a) : μ(K). Here, μ(−)
doubles as the homomorphic extension of μ, which maps closed Σ-expressions to closed
Σ'-expressions. Signature morphisms preserve typing and kinding, i.e., if ⊢_S E : F, then
⊢_T μ(E) : μ(F).
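
The homomorphic extension of a morphism can be sketched in Python as a recursive traversal of expressions. This is a minimal sketch under stated assumptions, not Twelf code: the classes `Sym`, `Var`, `App`, `Lam`, the function `apply_morphism`, and the example symbols (`a`, `e`, `op`, `nat`, `zero`, `plus`) are all hypothetical, and the morphism is assumed to be total on the symbols that occur.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """A symbol (constant or type family) declared in a signature."""
    name: str


@dataclass(frozen=True)
class Var:
    """A bound variable; morphisms leave it untouched."""
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    ty: object
    body: object


def apply_morphism(mu, expr):
    """Extend the symbol map mu homomorphically to whole expressions."""
    if isinstance(expr, Sym):
        return mu[expr.name]  # symbols of S are replaced by expressions over T
    if isinstance(expr, Var):
        return expr
    if isinstance(expr, App):
        return App(apply_morphism(mu, expr.fun), apply_morphism(mu, expr.arg))
    if isinstance(expr, Lam):
        return Lam(expr.var, apply_morphism(mu, expr.ty), apply_morphism(mu, expr.body))
    raise TypeError(f"unknown expression: {expr!r}")


# Hypothetical morphism from a signature with a, e, op to one with nat, zero, plus.
mu = {"a": Sym("nat"), "e": Sym("zero"), "op": Sym("plus")}

source = Lam("x", Sym("a"), App(App(Sym("op"), Sym("e")), Var("x")))
target = apply_morphism(mu, source)
print(target)
```

Because types are translated by the same traversal as terms, a term of type A is sent to a term of the translated type, which is the shape of the preservation property stated above.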

Signature declarations are straightforward: sig T = {Σ}. Signatures may be nested and may include
other signatures. Basic morphisms are given explicitly as {σ : S → T}, and composed morphisms are
formed from basic morphisms, identity, composition, and two kinds of named morphisms: views and
structures.¹

We will use the following grammar where the structure identifiers T.s and the symbol identifiers
S.c^μ and S.a^μ are described below:



Signature graphs  G ::= · | G, sig T = {Σ} | G, view v : S → T = μ
Signatures        Σ ::= · | Σ, sig T = {Σ} | Σ, include S | Σ, struct s : S = {σ}
                      | Σ, c : A [= t] | Σ, a : K [= A]
Morphisms         σ ::= · | σ, struct s := μ | σ, c := t | σ, a := A
Compositions      μ ::= T.s | {σ : S → T} | v | id | incl | μ μ
Contexts          Γ ::= · | Γ, x : A
Kinds             K ::= type | A → K
Type families     A ::= S.a^μ | A t | {x : A} A
Terms             t ::= S.c^μ | x | [x : A] t | t t





Modular LF uses the following judgments for well-formed syntax: 



⊳ G                 well-formed signature graphs
G ⊳ μ : S → T       morphism between signatures S and T declared in G
G ⊳_T Γ Ctx         contexts for signature T
G; Γ ⊳_T E : E'     E has type/kind E' over signature T and context Γ



The judgment for signature graphs mainly formalizes uniqueness of identifiers and type-preservation 
of morphisms based on the typing judgment for expressions. The judgments for contexts and typing are 
essentially the same as for non-modular LF except that the identifiers available in signature T and their 
types are determined by the module system. Therefore, we only describe the judgments for identifiers 
and morphisms and refer to [18] for details.

Morphisms. In this paper, we only consider a simplified language and employ the following
condition on all morphisms μ from S to T: T must include all signatures that S includes, and if S
includes R, the application of μ to symbols of R is the identity. In particular, views and
structures may only be declared if this condition holds.²

¹ Explicit morphisms are actually not present in [19]. They are easy to add conceptually, but are
a bit harder to add to Twelf as they violate the phase distinction between modular and non-modular
syntax kept by the other declarations. We will need them later on.

² The Twelf implementation covers the general case.
Firstly, the semantics of a structure declaration struct s : S = {σ} in T is that it is equivalent
to the following induced declarations: (i) for every constant c : A of S a constant s.c : T.s(A)
in T, and (ii) a morphism T.s from S to T that maps every symbol c of S to s.c. Here σ is a partial
morphism from S to T, and if σ contains c := t, the constant s.c is defined as t. In particular, t
must have type T.s(A) over T. The same holds for type family symbols a. Thus, structures
instantiate parametric signatures.
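
The induced declarations of a structure can be computed mechanically, as the following Python sketch illustrates. It rests on simplifying assumptions that are not part of the LF module system: types are flat tuples of tokens, the function `elaborate_structure` and the example signature (with the hypothetical names `this` and `le`) are made up for illustration, and tokens that are not declared in S (here `tp`, `tm`, `prop`, `=>`) are treated as included symbols that are mapped to themselves.

```python
def elaborate_structure(s, sig, sigma):
    """Elaborate `struct s : S = {sigma}` into its induced declarations.

    s:     name of the structure
    sig:   dict mapping each constant c of S to its type (a tuple of tokens)
    sigma: partial instantiation, mapping some constants c to a definition over T
    Returns the induced declarations (name, type, definition or None)
    and the morphism T.s as a dict c -> s.c.
    """
    morphism = {c: f"{s}.{c}" for c in sig}  # T.s maps every c to s.c
    declarations = []
    for c, ty in sig.items():
        # The type of s.c is the translation T.s(A) of the original type A.
        translated = tuple(morphism.get(tok, tok) for tok in ty)
        declarations.append((morphism[c], translated, sigma.get(c)))
    return declarations, morphism


# Hypothetical signature with a carrier type and a binary relation on it.
order = {
    "this": ("tp",),
    "le": ("tm", "(", "this", "=>", "this", "=>", "prop", ")"),
}

# struct o : order = {this := this}: only the carrier is instantiated.
declarations, morphism = elaborate_structure("o", order, {"this": "this"})
for name, ty, definition in declarations:
    print(name, ":", " ".join(ty), "" if definition is None else f"= {definition}")
```

The constant that σ instantiates receives a definition, and the remaining one stays a parameter whose type mentions the translated carrier.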

Because structures are named, a signature may have multiple structures of the same signature,
which are all distinct. For example, if S already contains a structure r instantiating a third
signature R, then struct r' : R in T leads to the two morphisms r' and the composition S.r T.s
from R to T and two copies of the constants of R. Structures may instantiate whole structures at
once: If T declares instead struct r' : R = {struct r := T.r'}, then the two copies of R are
shared. More generally, σ may contain instantiations struct r := μ for a morphism μ from R to T,
which is equivalent to instantiating every symbol c of R with μ(c). Another way to say this is
that the diagram on the right commutes.

Secondly, the semantics of anonymous morphisms {σ : S → T} is straightforward. They are
well-formed if σ is total and map all constants according to σ. Thirdly, views v are just names
given to existing morphisms.

Fourthly, inclusion, identity and composition are defined by 

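\[
\frac{\mathtt{sig}\ T = \{\Sigma\}\ \text{in}\ G \qquad \mathtt{include}\ S\ \text{in}\ \Sigma}
     {G \rhd \mathit{incl} : S \to T}\ \mathit{incl}
\]

\[
\frac{\mathtt{sig}\ T = \{\Sigma\}\ \text{in}\ G}
     {G \rhd \mathit{id} : T \to T}\ \mathit{id}
\qquad
\frac{G \rhd \mu : R \to S \qquad G \rhd \mu' : S \to T}
     {G \rhd \mu\ \mu' : R \to T}\ \mathit{comp}
\]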



Identifiers. Defining which symbol identifiers are available in a signature is intuitively easy,
but a formal definition can be cumbersome because all included symbols and those induced by
structures have to be computed along with their translated types and definitions. Using morphisms
and the novel notation S.c^μ for symbol identifiers, we can give a very elegant definition:

\[
\frac{\mathtt{sig}\ S = \{\Sigma\}\ \text{in}\ G \qquad c : E\ \text{in}\ \Sigma \qquad G \rhd \mu : S \to T}
     {G \rhd_T S.c^{\mu} : \mu(E)}\ \mathit{tp}
\]

and similarly for defined symbols and type family constants. The price to pay is an awkward notation, 
but we can recover the usual notations as follows: 

• id yields local symbols, and we write c instead of T.c^id.

• incl yields included symbols, and we write S.c instead of S.c^incl.

• If T contains a structure from S, we have G ⊳ T.s : S → T, and we write s.c instead of
S.c^{T.s}. Accordingly, we introduce constants s.r.c for composed morphisms S.r T.s from R to T,
and so on.

• All other identifiers T.c^μ, e.g., those where μ contains views or anonymous morphisms, are
reduced to one of the other cases by applying the morphism.
Functors. While views are well-established in logical frameworks based on model theory (see, e.g.,
[?, 20]), they are an unusual feature in proof theoretical frameworks. (In fact, the LF module
system has been criticized for using views instead of functors or even - in light of [?] - for
using either one rather than only structures.) Therefore, we quickly describe how functors are a
derived notion in the presence of views and anonymous morphisms.

Assume a functor F from S to T. Its input is a structure s instantiating S, and its output is a
list σ of instantiations for the symbols of T. Here σ may refer to the symbols induced by s. We
can write this in LF as

sig F₀ = {struct s : S}     view F : T → F₀ = {σ}

Now given a theory D, we can understand instances of a signature S over D as morphisms from S to D.
This is justified because a morphism from S to D realizes every declaration of S in terms of D.
More generally we can think of morphisms from S to D as implementations or models of S in terms
of D. The application of F should map instances of S to instances of T. Thus, given a morphism μ
from S to D, we can write the application F(μ) as the composed morphism

F {struct s := μ : F₀ → D}

which is indeed a morphism from T to D.
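
Functor application as morphism composition can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than an account of Twelf: morphisms are modelled as dictionaries from symbol names to expressions, expressions are strings or nested tuples, and the names `apply`, `compose`, `struct_instantiation`, `elem`, `pair`, `mk`, and `zero` are hypothetical. Names that a morphism does not mention are assumed to be included symbols and are left fixed.

```python
def apply(mu, expr):
    """Apply a morphism to an expression (a name or a nested tuple of expressions)."""
    if isinstance(expr, str):
        return mu.get(expr, expr)  # unmapped names (included symbols) stay fixed
    return tuple(apply(mu, e) for e in expr)


def compose(mu1, mu2):
    """Composition in diagram order: first mu1, then mu2."""
    return {c: apply(mu2, e) for c, e in mu1.items()}


def struct_instantiation(s, mu):
    """The anonymous morphism {struct s := mu}: it maps each s.c to mu(c)."""
    return {f"{s}.{c}": e for c, e in mu.items()}


# Hypothetical functor: S declares elem, T declares pair,
# and the view F : T -> F0 defines pair in terms of the structure s.
F = {"pair": ("mk", "s.elem", "s.elem")}

# An instance of S over D, given as a morphism from S to D.
mu = {"elem": "zero"}

# F(mu) = F {struct s := mu : F0 -> D}, a morphism from T to D.
F_mu = compose(F, struct_instantiation("s", mu))
print(F_mu)
```

The result realizes every symbol of T in terms of D, which is what an instance of T over D is taken to be in the text.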

3 Morphism Variables in LF 

We add a feature to the LF module system that permits morphism variables and abstraction over them. 
Therefore, we add the following productions to the grammar: 

Contexts  Γ ::= Γ, X : S     Compositions  μ ::= X

Due to the presence of morphism variables, the judgment for well-formed morphisms must be amended
to depend on the context. Then we can give the typing rules as:

\[
\frac{G \rhd_T \Gamma\ \mathrm{Ctx} \qquad \mathtt{sig}\ S = \{\Sigma\}\ \text{in}\ G}
     {G \rhd_T \Gamma, X : S\ \mathrm{Ctx}}\ \mathit{contmor}
\qquad
\frac{G \rhd_T \Gamma, X : S\ \mathrm{Ctx}}
     {G; \Gamma, X : S \rhd X : S \to T}\ \mathit{morvar}
\]

where we retain the restriction on signature inclusions: All signatures included into S must also
be included into T.

Note that we can understand the signature S as a (dependent) record type, a morphism μ : S → T as
a record value of type S visible in the signature T, and an identifier S.c^μ as the projection out
of the record type S at the field c applied to μ. Then X is simply a variable of record type, and
abstraction over morphism variables is straightforward:

Type families  A ::= {X : S} A     Terms  t ::= [X : S] t | t μ

and (omitting the obvious Π-rule and the rules for β- and η-conversion)

\[
\frac{G; \Gamma, X : S \rhd_T t : A}
     {G; \Gamma \rhd_T [X : S]\, t : \{X : S\}\, A}\ \mathit{morlam}
\qquad
\frac{G; \Gamma \rhd_T f : \{X : S\}\, A \qquad G; \Gamma \rhd \mu : S \to T}
     {G; \Gamma \rhd_T f\ \mu : A[X/\mu]}\ \mathit{morapp}
\]
Here t and A may contain occurrences of the morphism variable X. In particular X may occur as a
morphism argument to some expression, e.g., g X, or in an identifier S.c^X, which we write as X.c
in accordance with our notation for structures.

A crucial feature of the LF module system is that it is conservative: Modular signatures can be 
elaborated into non-modular ones (essentially by replacing every structure declaration with the induced 
constant declarations). We want to elaborate morphism variables similarly. 

To elaborate X : S, we can assume that all structures in S have already been elaborated and that
all defined symbols have been removed by expanding definitions, i.e., (up to reordering) S is of
the form sig S = {include R₁, ..., include Rₘ, c₁ : B₁, ..., cₙ : Bₙ}. Then [X : S] t in a
signature T is elaborated to [x₁ : B'₁] ... [xₙ : B'ₙ] t' where for expressions E over S, we obtain
E' by replacing every occurrence of X with the morphism {c₁ := x₁, ..., cₙ := xₙ : S → T}. (In
particular, after using morphism application, the identifiers S.cᵢ^X simply become xᵢ.) {X : S} A
is elaborated accordingly. Finally, t μ is elaborated to t μ(S.c₁) ... μ(S.cₙ).
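
The elaboration of morphism variables into ordinary LF binders can be sketched in Python. The sketch makes simplifying assumptions that are not from the paper: types and bodies are flat tuples of tokens, the bound variables are named `x_c` after the constants they replace, and the functions `elaborate_abstraction` and `elaborate_application` as well as the example names (`this`, `le`, `g`, `nat`, `nat_le`) are hypothetical.

```python
def elaborate_abstraction(X, sig, body):
    """Elaborate [X : S] body into a list of ordinary binders and a new body.

    sig:  list of (c_i, B_i) for the constants of S, with B_i a tuple of tokens
    body: tuple of tokens that may mention identifiers of the form X.c_i
    """
    # Inside S, a constant c_j is referred to by its plain name; in the body as X.c_j.
    local = {c: f"x_{c}" for c, _ in sig}
    qualified = {f"{X}.{c}": f"x_{c}" for c, _ in sig}
    binders = [(local[c], tuple(local.get(tok, tok) for tok in B)) for c, B in sig]
    new_body = tuple(qualified.get(tok, tok) for tok in body)
    return binders, new_body


def elaborate_application(fun, sig, mu):
    """Elaborate `fun mu` into fun applied to mu(c_1) ... mu(c_n)."""
    return (fun,) + tuple(mu[c] for c, _ in sig)


# Hypothetical signature S without type family declarations of its own.
S = [
    ("this", ("tp",)),
    ("le", ("tm", "(", "this", "=>", "this", "=>", "prop", ")")),
]

binders, new_body = elaborate_abstraction("X", S, ("tm", "X.this"))
applied = elaborate_application("g", S, {"this": "nat", "le": "nat_le"})
print(binders)
print(new_body)
print(applied)
```

Each constant of S turns into one bound variable, so abstraction over a morphism variable becomes a sequence of ordinary abstractions, and morphism application becomes a sequence of ordinary applications.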

This extended module system is not conservative over LF: S may contain type declarations, but LF
does not permit abstraction over type variables. But we obtain conservativity if we make the
following additional restriction: Contexts Γ, X : S are only well-formed if all type family
symbols R.a^μ available in S are included from other signatures, i.e., μ = incl ... incl.
Conversely, neither S nor any signature that S instantiates may contain type family declarations.

This restriction may appear to be introduced ad hoc, but in fact we consider it quite natural.
Assume we have LF signatures ⌜L⌝ and ⌜T⌝ that represent an object logic L and a theory T of L.
Then, typically, ⌜L⌝ contains type declarations for the syntactic categories and judgments of L
and constant declarations for the logical symbols and inference rules; ⌜T⌝ includes ⌜L⌝ and adds
constant declarations for the non-logical symbols (sorts, functions, predicates, etc.) and axioms.
Thus, our extension lets us abstract over morphisms out of theories but not over morphisms out of
object logics. And the former are exactly the morphisms that we are interested in because
morphisms out of ⌜T⌝ can be used to represent models or implementations of T. In particular,
below, T will be an axiomatic type class and morphisms out of ⌜T⌝ will be type class instances.

4 Representing Isabelle in LF 

The representation of Isabelle in LF proceeds in two steps. In a first step, we declare an LF
signature Pure for the inner syntax of Isabelle. This syntax declares symbols for all primitives
that can occur (explicitly or implicitly) in Pure expressions. In a second step, every Isabelle
expression E is represented as an LF expression ⌜E⌝. Finally we have to justify the adequacy of
the encoding.

For the inner syntax, the LF signature Pure is given in Fig. 2. This is a straightforward
intrinsically typed encoding of higher-order logic in LF (e.g., as in [6]). Pure types τ are
encoded as LF-terms ⌜τ⌝ : tp and Pure terms t :: τ as LF-terms ⌜t⌝ : tm ⌜τ⌝. Using higher-order
abstract syntax, the LF function space A → B with λ-abstraction [x : A] t and application f t is
distinguished from the encoding tm (⌜σ⌝ ⇒ ⌜τ⌝) of the Isabelle function space with application
⌜f⌝ @ ⌜t⌝ and λ-abstraction λ([x : tm ⌜τ⌝] ⌜t⌝). Pure propositions φ are encoded as LF-terms
⌜φ⌝ : tm prop, and Pure inferences P proving φ as LF-terms ⌜P⌝ of type ⊢ ⌜φ⌝. Where possible, we
use the same symbol names in LF as in Isabelle, and we can also mimic most of the Isabelle
operator fixities and precedences.

sig Pure = {
  tp    : type.
  ⇒     : tp → tp → tp.                                  infix right ⇒.
  tm    : tp → type.                                     prefix tm.
  λ     : (tm A → tm B) → tm (A ⇒ B).
  @     : tm (A ⇒ B) → tm A → tm B.                      infix left 1000 @.
  prop  : tp.
  ⋀     : (tm A → tm prop) → tm prop.
  ⟹     : tm prop → tm prop → tm prop.                   infix right 1 ⟹.
  =     : tm A → tm A → tm prop.                         infix none 2 =.
  ⊢     : tm prop → type.                                prefix ⊢.
  ⋀I    : ({x : tm A} ⊢ (B x)) → ⊢ ⋀([x] B x).
  ⋀E    : ⊢ ⋀([x] B x) → {x : tm A} ⊢ (B x).
  ⟹I    : (⊢ A → ⊢ B) → ⊢ A ⟹ B.
  ⟹E    : ⊢ A ⟹ B → ⊢ A → ⊢ B.
  refl  : ⊢ X = X.
  subs  : {F : tm A → tm B} ⊢ X = Y → ⊢ F X = F Y.
  exten : ({x : tm A} ⊢ (F x) = (G x)) → ⊢ λF = λG.
  beta  : ⊢ (λ[x : tm A] F x) @ X = F X.
  eta   : ⊢ λ([x : tm A] F @ x) = F.
  sig Type = {this : tp.}.
}.

Figure 2: LF Signature for Isabelle

The signature Pure only encodes how composed Pure expressions are formed from the atomic ones.
The atomic expressions - variables and constants etc. - are added when encoding the outer syntax
as LF declarations. For the non-modular declarations, this is straightforward; an overview is
given in the following table:



Expression                     Isabelle           LF
base type, type operator       (α₁, ..., αₙ) t    t : tp → ... → tp → tp
type variable                  α                  α : tp
constant                       c :: τ             c : tm ⌜τ⌝
variable                       x :: τ             x : tm ⌜τ⌝
assumption/axiom/definition    a : φ              a : ⊢ ⌜φ⌝
theorem                        a : φ P            a : ⊢ ⌜φ⌝ = ⌜P⌝



The main novelty of our encoding is to also cover the modular declarations. The basic idea is 
to represent all high-level scoping concepts as signatures and all relations between them as signature 
morphisms as in the following table: 



Isabelle                                               LF
theory, locale, type class                             signature
theory import                                          morphism (inclusion)
locale import, type class import                       morphism (structure)
sublocale, interpretation, type class instantiation    morphism (view)
instance of type class C                               morphism with domain C

In the following, we give the important cases of the mapping ⌜−⌝ from Isabelle to LF by induction
on the Isabelle syntax. We occasionally use color to distinguish the meta-level symbols (such
as =) from Isabelle and Twelf syntax such as =.

Theories. Isabelle theories and theory imports are encoded directly as LF-signatures and signature
inclusions. The only subtlety is that the LF encodings additionally include our Pure signature.

⌜theory T imports T₁, ..., Tₙ begin Σ end⌝ =
    sig T = {include Pure. include T₁. ... include Tₙ. ⌜Σ⌝}.

where the body Σ of the theory is translated component-wise as described by the respective cases
below.

Type Classes. The basic idea of the representation of Isabelle type classes in LF is as follows:
An Isabelle type class C is represented as an LF signature C that contains all the declarations of
C and a field this : tp. All occurrences in C of the single permitted type variable α :: C are
translated to this such that this represents the type that is an instance of C.

This means that α is not considered as a type variable but as a type declaration that is present
in the type class. This change of perspective is essential to obtain an elegant encoding of type
classes.

In particular, the subsignature Type of Pure represents the type class of all types. Morphisms
with domain Type are simply terms of type tp, i.e., types.

The central invariant of the representation is this: An Isabelle type class instance τ :: C is
represented as an LF morphism ⌜τ :: C⌝ from C into the current LF signature that maps the field
this to ⌜τ⌝ and all operations of C to the encoding of their definitions at τ. Thus, in
particular, ⌜τ :: C⌝(C.this) = ⌜τ⌝.
Example 2 (Continued). The first type class from Ex. 1 is represented in LF as follows:

sig order = {this : tp. ≤ : tm (this ⇒ this ⇒ prop)}

In general, we represent type classes as follows:

⌜class C = C₁ ... Cₙ + Σ⌝ = sig C = {this : tp. I₁. ... Iₙ. ⌜Σ⌝}.

where Iᵢ abbreviates struct insᵢ : Cᵢ = {this := this. ρᵢ} for some fresh names insᵢ. Since one
this is imported from each superclass Cᵢ, they must be shared using the instantiations
this := this. ρᵢ contains one structure sharing declaration for each type class imported by Iᵢ
that has already been imported by I₁, ..., Iᵢ₋₁.

Example 3 (Continued). The second type class from Ex. 1 is represented in LF as follows:

sig semlat = {this : tp. struct o : order = {this := this}. ⊓ : tm (this ⇒ this ⇒ this)}

A type class instantiation

instantiation t :: (C₁, ..., Cₙ) C begin Σ π end

is represented as an LF functor taking instances of the Cᵢ and returning an instance of C. We
represent such a functor as a signature

sig ν = {struct α₁ : C₁. ... struct αₙ : Cₙ}.

collecting the input and a view

view ν' : C → ν = {this := t α₁.this ... αₙ.this. ⌜Σ⌝ ⌜π⌝}.

describing the output. ν' must map the field this of C to the type that is an instance of C. This
type is obtained by applying t to the argument types that are instances of the Cᵢ. In Isabelle,
this is t α₁ ... αₙ; in LF, each αᵢ is a structure of Cᵢ, thus we use the induced constants
αᵢ.this.

Here r XP gives instantiations that map every constant of C to its definition in terms of the a,. Simi- 
larly, r 7T n maps every axiom of C to its proof. Note how - in accordance with the Curry-Howard repre- 
sentation of proofs as terms - the discharging of proof obligations is just a special case of instantiating a 
constant. 

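Now assume type class instances $\tau_i :: C_i$ encoded as morphisms $\ulcorner \tau_i :: C_i \urcorner : C_i \to S$ (where $S$ is the
current signature). The encoding $\ulcorner (\tau_1, \ldots, \tau_n)\,t :: C \urcorner : C \to S$ is obtained as the composition

$$\nu'\ \{\mathtt{struct}\ \alpha_1 := \ulcorner \tau_1 :: C_1 \urcorner\ \ldots\ \mathtt{struct}\ \alpha_n := \ulcorner \tau_n :: C_n \urcorner : \nu \to S\}.$$

Clearly this is a morphism from $C$ to $S$; we need to show that indeed

$$\ulcorner (\tau_1, \ldots, \tau_n)\,t :: C \urcorner(C.\mathit{this}) = \ulcorner t\ \tau_1 \ldots \tau_n \urcorner.$$

This holds because

$$\begin{aligned}
\ulcorner (\tau_1, \ldots, \tau_n)\,t :: C \urcorner(C.\mathit{this})
&= \{\ldots\ \mathtt{struct}\ \alpha_i := \ulcorner \tau_i :: C_i \urcorner\ \ldots\}(\nu'(C.\mathit{this})) \\
&= \{\ldots\ \mathtt{struct}\ \alpha_i := \ulcorner \tau_i :: C_i \urcorner\ \ldots\}(t\ \alpha_1.\mathit{this} \ldots \alpha_n.\mathit{this}) \\
&= t\ \ulcorner \tau_1 :: C_1 \urcorner(C_1.\mathit{this}) \ldots \ulcorner \tau_n :: C_n \urcorner(C_n.\mathit{this}) \\
&= t\ \ulcorner \tau_1 \urcorner \ldots \ulcorner \tau_n \urcorner \\
&= \ulcorner t\ \tau_1 \ldots \tau_n \urcorner
\end{aligned}$$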
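
The calculation above can be replayed on a toy model. The following Python sketch is only an illustration under simplifying assumptions and is not part of the encoding: terms are nested tuples, a morphism is a dictionary from symbols to terms, only the field this is tracked (constants and axioms are omitted), and the type operator prod as well as the instances nat :: order and bool :: order are hypothetical.

```python
from typing import Dict, Tuple, Union

# A term is a symbol, or an application written as (head, arg_1, ..., arg_n).
Term = Union[str, Tuple["Term", ...]]
# A morphism maps symbols of its domain to terms over its codomain.
Morphism = Dict[str, Term]


def apply(m: Morphism, term: Term) -> Term:
    """Homomorphic extension of a morphism to terms; unmapped symbols stay fixed."""
    if isinstance(term, str):
        return m.get(term, term)
    return tuple(apply(m, sub) for sub in term)


def compose(first: Morphism, second: Morphism) -> Morphism:
    """Composition in diagram order: apply `first`, then `second`."""
    return {sym: apply(second, image) for sym, image in first.items()}


def instantiate(struct_name: str, instance: Morphism) -> Morphism:
    """Model of `struct struct_name := instance`: map struct_name.c to instance(c)."""
    return {f"{struct_name}.{sym}": image for sym, image in instance.items()}


# Hypothetical instantiation  prod :: (order, order) order.
# The view nu' : order -> nu sends `this` to  prod a1.this a2.this.
view = {"this": ("prod", "a1.this", "a2.this")}

# Hypothetical instances nat :: order and bool :: order as morphisms order -> S.
nat_order = {"this": "nat"}
bool_order = {"this": "bool"}

# The morphism nu -> S built from the two structure instantiations.
arguments = {**instantiate("a1", nat_order), **instantiate("a2", bool_order)}

# The composed morphism order -> S encoding (nat, bool) prod :: order.
prod_instance = compose(view, arguments)
```

In this model, prod_instance["this"] is the tuple ("prod", "nat", "bool"), which corresponds to the last line of the calculation.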

We have the general result that the Isabelle subclass relation $C \subseteq D$ holds iff there is an LF morphism
$i : D \to C$. Then if the type class instance $\tau :: C$ (occurring in some theory or locale $S$) is represented as a
morphism $\ulcorner \tau :: C \urcorner : C \to S$, the type class instance $\tau :: D$ is represented as $\ulcorner \tau :: D \urcorner = i\ \ulcorner \tau :: C \urcorner$. Isabelle
has the limitation that there can be at most one way in which $C$ is a subclass of $D$, which has the advantage
that $i$ is unique and can be dropped from the notation. In LF, we have to make it explicit.

Example 4 (Continued). The trivial subclass relation $\mathit{order} \subseteq \mathit{Type}$ is represented by the morphism
$i = \{\mathit{this} := \mathit{this} : \mathit{Type} \to \mathit{order}\}$. The subclass relation $\mathit{semlat} \subseteq \mathit{order}$ is represented by the morphism
$\mathit{semlat}.o$. Finally, the morphism $i\ \mathit{semlat}.o = \{\mathit{this} := \mathit{this} : \mathit{Type} \to \mathit{semlat}\}$ represents $\mathit{semlat} \subseteq \mathit{Type}$.
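
As a rough illustration of how such morphisms compose, the following Python sketch models morphisms that merely rename symbols as dictionaries. It is a simplification and not the LF semantics: axioms are ignored, the symbol names o.< and inf are chosen for the sketch, and the instance nat :: semlat with images nat_le and nat_min is hypothetical.

```python
from typing import Dict

# A renaming morphism maps each symbol of its domain to a symbol of its codomain.
Renaming = Dict[str, str]


def compose(first: Renaming, second: Renaming) -> Renaming:
    """Composition in diagram order: apply `first`, then `second`."""
    return {sym: second[image] for sym, image in first.items()}


# i : Type -> order
i = {"this": "this"}

# semlat.o : order -> semlat. The field `this` is shared, while the order
# relation of `order` becomes the qualified symbol o.< of `semlat`.
semlat_o = {"this": "this", "<": "o.<"}

# The composite Type -> semlat representing that semlat is a subclass of Type.
type_to_semlat = compose(i, semlat_o)

# A hypothetical instance nat :: semlat as a morphism semlat -> S ...
nat_semlat = {"this": "nat", "o.<": "nat_le", "inf": "nat_min"}

# ... induces the instance nat :: order by composing with semlat.o first.
nat_order = compose(semlat_o, nat_semlat)
```

Here nat_order forgets the semilattice operation and keeps only the images of the symbols of order, which mirrors how an instance of the subclass is turned into an instance of the superclass.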

Locales Similarly to type classes, Isabelle locales are encoded as subsignatures: For example, 

$$\mathtt{locale}\ \mathit{loc} = \mathit{ins}_1 : \mathit{loc}_1\ \mathtt{where}\ \sigma_1\ \mathtt{for}\ \Sigma + \Sigma'$$

is encoded as the LF signature

$$\mathtt{sig}\ \mathit{loc} = \{\vartheta\ \ulcorner \Sigma \urcorner\ \mathtt{struct}\ \mathit{ins}_1 : \mathit{loc}_1 = \{\ulcorner \sigma_1 \urcorner\}.\ \ulcorner \Sigma' \urcorner\}.$$

Here $\vartheta$ contains type declarations $\alpha : \mathit{tp}$ for the free type variables of the locale. Those are the free
type variables that occur in the declarations of $\Sigma$ and $\Sigma'$. These correspond to the single declaration
$\mathit{this} : \mathit{tp}$ in type classes. This encoding of type variables may be surprising because free type variables
correspond to universal types whereas the declarations in $\vartheta$ correspond to existential types. We hold that
our LF-encoding precisely captures the intended meaning of locales, whereas the definition of locales
within Isabelle prefers universal type variables in order to be compatible with the underlying type theory.

If a locale inherits from more than one locale, the encoding is defined correspondingly using one
structure $\mathtt{struct}\ \mathit{ins}_i : \mathit{loc}_i = \{\vartheta_i\ \ulcorner \sigma_i \urcorner\}$ for each locale instance $\mathit{ins}_i : \mathit{loc}_i\ \mathtt{where}\ \sigma_i$. Here $\vartheta_i$ contains the
instantiations for free type variables of $\mathit{loc}_i$ that are induced by $\sigma_i$ and inferred by Isabelle. Furthermore,
some additional sharing declarations become necessary due to a subtlety in the semantics of Isabelle
locales: If a locale inherits two equal instances (same locale, same instantiations), they are implicitly
identified. But in LF different structures are always distinguished unless shared explicitly. Therefore, we
have to add to $\ulcorner \sigma_i \urcorner$ one sharing declaration $\mathtt{struct}\ \mathit{ins} := \mathit{ins}'$ for each instance $\mathit{ins}$ present in $\mathit{loc}_i$ that
is equal to one already imported by one of $\mathit{ins}_1, \ldots, \mathit{ins}_{i-1}$.
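
The bookkeeping behind these sharing declarations can be sketched as follows. This Python fragment is an illustration under assumptions, not the translation itself: each inherited instance is given as a list of the instances it brings along, equality of instances is approximated by a key consisting of the locale name and a normalized string for the instantiations, and the locale bounded and the qualified path inf.o are hypothetical.

```python
from typing import Dict, List, Tuple

# Two instances count as equal if they agree on locale and instantiations.
Key = Tuple[str, str]
# An inherited instance: its name and the (local name, key) pairs it brings along.
Inherited = Tuple[str, List[Tuple[str, Key]]]


def sharing_declarations(instances: List[Inherited]) -> Dict[str, List[str]]:
    """For each inherited instance, list the sharing declarations it needs.

    A nested instance needs a sharing declaration exactly when an equal
    instance was already imported by an earlier inherited instance.
    """
    first_seen: Dict[Key, str] = {}
    result: Dict[str, List[str]] = {}
    for ins_name, nested in instances:
        decls: List[str] = []
        for local_name, key in nested:
            if key in first_seen:
                decls.append(f"struct {local_name} := {first_seen[key]}")
            else:
                first_seen[key] = f"{ins_name}.{local_name}"
        result[ins_name] = decls
    return result


# Hypothetical locale inheriting inf : semlat and top : bounded, where both
# bring along the same instance o of order (same instantiation of <).
shared = sharing_declarations([
    ("inf", [("o", ("order", "<"))]),
    ("top", [("o", ("order", "<"))]),
])

# If the second instance uses a different instantiation, nothing is shared.
unshared = sharing_declarations([
    ("inf", [("o", ("order", "<"))]),
    ("sup", [("o", ("order", "flipped <"))]),
])
```

In the first call only top receives a declaration, namely struct o := inf.o; in the second call the two order instances differ and remain distinct, as for the two semilattices of a lattice.
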
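
Example 5 (Continued). The locale from Ex. 1 is represented in LF as follows:

$$\begin{array}{l}
\mathtt{sig}\ \mathit{lat} = \{ \\
\quad \mathtt{struct}\ \mathit{inf} : \mathit{semlat}. \\
\quad \mathtt{struct}\ \mathit{sup} : \mathit{semlat} = \{\mathit{this} := \mathit{inf}.\mathit{this}.\ {<} := \lambda [x]\ \lambda [y]\ \mathit{inf}.{<} \mathbin{@} y \mathbin{@} x\}. \\
\}
\end{array}$$

Note how the instantiation for $<$ induces an instantiation for the type $\mathit{this}$. In other words, the $\vartheta_i$
mentioned above is $\mathit{this} := \mathit{inf}.\mathit{this}$.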

Sublocale declarations are encoded as views from the super- to the sublocale. Thus, the declaration 

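$$\mathtt{sublocale}\ \mathit{loc}' < \mathit{loc}\ \mathtt{where}\ \sigma\ \pi$$

is encoded as (for some fresh name $\nu$):

$$\mathtt{view}\ \nu : \mathit{loc} \to \mathit{loc}' = \{\vartheta\ \ulcorner \sigma \urcorner\ \ulcorner \pi \urcorner\}.$$

Here $\vartheta$ contains the instantiations of the free type variables (see above), which are inferred by Isabelle
based on the instantiations in $\sigma$.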

Locale interpretations are encoded in the same way except that the codomain is the current LF
signature (which encodes the Isabelle theory containing the locale interpretation) instead of the sublocale. 

As for type classes, we have the general result that $\mathit{loc}'$ is a sublocale of $\mathit{loc}$ iff there is an LF signature
morphism from $\mathit{loc}$ to $\mathit{loc}'$. Accordingly, $\mathit{loc}$ can be interpreted in the theory $T$ iff there is a morphism
from $\mathit{loc}$ to $T$. For example, $\mathit{loc}'$ is a sublocale of $\mathit{loc}_1$ from above via the composed morphism $\mathit{loc}.\mathit{ins}_1\ \nu$.
Contrary to type classes, there may be several different sublocale relationships between two locales. In 
LF these are distinguished elegantly as different morphisms between the locales.

Example 6 (Continued). $\mathit{lat}$ is a sublocale of $\mathit{semlat}$ in two different ways, represented by the LF
morphisms $\mathit{lat}.\mathit{inf}$ and $\mathit{lat}.\mathit{sup}$. These are trivial sublocale relations induced by inheritance.

Constant Declarations Finally we have to represent those aspects of the non-modular declarations 
that are affected by type classes. We will only consider the case of constants. Definitions, axioms, and 
theorems are represented accordingly. The central idea is that free type variables constrained by type
classes are represented using $\lambda$-abstraction for morphism variables.

An Isabelle constant $c :: \tau$ with free type variables $\alpha_i :: C_i$ is represented as the LF-constant taking
morphism arguments:

$$c : \{\alpha_1 : C_1\} \ldots \{\alpha_n : C_n\}\ \mathit{tm}\ \ulcorner \tau \urcorner.$$

Here in $\ulcorner \tau \urcorner$ every occurrence of the morphism variable $\alpha_i$ is represented as $\alpha_i.\mathit{this}$.

Whenever $c$ is used with inferred type arguments $\tau_i :: C_i$ in a composed expression, it is represented
by application of $c$ to morphisms:

$$\ulcorner c \urcorner = c\ \ulcorner \tau_1 :: C_1 \urcorner \ldots \ulcorner \tau_n :: C_n \urcorner.$$
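
Readers familiar with dictionary passing may find the following analogy helpful. It is only an analogy in Python and not the LF encoding: a record of class operations plays the role of a morphism out of the type class, and a constant with a constrained type variable becomes a function that takes such a record as an argument. The names OrderInstance, maximum, nat_order, and rev_order are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class OrderInstance:
    """Plays the role of a morphism from the type class `order` into the current scope."""
    this: type                       # image of the field `this`
    le: Callable[[Any, Any], bool]   # image of the order relation


def maximum(alpha: OrderInstance) -> Callable[[Any, Any], Any]:
    """A constant with one constrained type variable, abstracted over its instance."""
    return lambda x, y: y if alpha.le(x, y) else x


# Two hypothetical instances on the same carrier type.
nat_order = OrderInstance(this=int, le=lambda x, y: x <= y)
rev_order = OrderInstance(this=int, le=lambda x, y: y <= x)

# Using the constant at a concrete type means applying it to an instance first.
max_nat = maximum(nat_order)
max_rev = maximum(rev_order)
```

The explicit argument also shows why two different instances on the same carrier type can coexist: max_nat(3, 5) and max_rev(3, 5) differ because different instances were passed.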




Actually, we cannot use the same identifier $c$ in LF as in Isabelle: Instead, we must keep track of how $c$
came into scope. For example, if $c$ was imported from some theory $S$, we must use $S.c$ in LF; if the
current scope is a locale and $c$ was imported from some other locale via an instance $\mathit{ins}$, we must use
$\mathit{ins}.c$ in LF; if $c$ was moved into the current theory from a locale $\mathit{loc}$ via an interpretation declaration
which was encoded using the fresh name $\nu$, we must use $\mathit{loc}.c\ \nu$ in LF, and so on.

Types The representation of types was already indicated above, but we summarize it here for clarity. 
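Type operator declarations $(\alpha_1, \ldots, \alpha_n)\,t$ are encoded as constants $t : \mathit{tp} \to \ldots \to \mathit{tp} \to \mathit{tp}$. And types
occurring in expressions are encoded as

$$\begin{aligned}
\ulcorner \alpha :: C \urcorner &= \alpha.\mathit{this} \\
\ulcorner t \urcorner &= t \\
\ulcorner (\tau_1, \ldots, \tau_n)\,t \urcorner &= t\ \ulcorner \tau_1 \urcorner \ldots \ulcorner \tau_n \urcorner \\
\ulcorner \tau_1 \Rightarrow \tau_2 \urcorner &= \ulcorner \tau_1 \urcorner \Rightarrow \ulcorner \tau_2 \urcorner \\
\ulcorner \mathit{prop} \urcorner &= \mathit{prop}.
\end{aligned}$$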
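
The clauses above amount to a simple recursive function on types. The following Python sketch spells this out for a hypothetical abstract syntax (the classes TVar, TApp, Fun, and Prop are assumptions of the sketch) and prints the LF type as a string, using the ASCII arrow => and adding parentheses where needed.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class TVar:
    """A type variable alpha :: C."""
    name: str
    cls: str


@dataclass(frozen=True)
class TApp:
    """A type operator application (tau_1, ..., tau_n) t; n may be zero."""
    op: str
    args: Tuple["Ty", ...] = ()


@dataclass(frozen=True)
class Fun:
    """A function type tau_1 => tau_2."""
    dom: "Ty"
    cod: "Ty"


@dataclass(frozen=True)
class Prop:
    """The type prop."""


Ty = Union[TVar, TApp, Fun, Prop]


def encode(ty: Ty) -> str:
    """Encode an Isabelle type following the clauses for type variables,
    type operators, function types, and prop."""
    if isinstance(ty, TVar):
        return f"{ty.name}.this"
    if isinstance(ty, TApp):
        return " ".join([ty.op] + [atom(arg) for arg in ty.args])
    if isinstance(ty, Fun):
        return f"{atom(ty.dom)} => {encode(ty.cod)}"
    if isinstance(ty, Prop):
        return "prop"
    raise TypeError(f"not a type: {ty!r}")


def atom(ty: Ty) -> str:
    """Encode a type and parenthesize the result if it is not a single token."""
    text = encode(ty)
    return f"({text})" if " " in text else text


a = TVar("a", "order")
binary_predicate = encode(Fun(a, Fun(a, Prop())))
list_of_a = encode(TApp("list", (a,)))
```

Note that the class constraint of a type variable does not appear in the output: it is carried by the morphism variable itself, whose field this is what the encoded type refers to.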



Adequacy Before we state the adequacy, we need to clarify in what sense our representation is ade- 
quate. In Isabelle, locales and type classes are not primitive notions. Instead, they are internally elab- 
orated into the underlying type theory. For example, all declarations in a locale or a type class are 
relativized and lifted to the top level. Thus, they are available elsewhere and not only within the locale. 
While there are certainly situations when this is useful, here we care about the modular structure and 
the underlying type theory, but not about the elaboration of the former into the latter. Therefore, we 
do not want a representation in LF that adequately preserves the elaboration. In fact, if we wanted to 
preserve the elaboration, we could simply use Isabelle to eliminate all modular structure and represent 
the non-modular result using well-known representations of higher-order logic in LF. 

Therefore, we have to forbid all Isabelle theories where names are used outside their scope. Let us call 
an Isabelle theory simple if all declared names are only used in their respective declaration scope - theory, 
locale, or type class - unless they were explicitly moved into a new scope using imports, sublocale, 
interpretation, or instantiation declarations, or using inheritance between type classes and lo- 
cales. 

Then we can summarize our representation with the following theorem: 

Theorem 7. A simple sequence of Isabelle theories $T_1 \ldots T_n$ is well-formed (in the sense of Isabelle) iff
the LF signature graph $\mathit{Pure}\ \ulcorner T_1 \urcorner \ldots \ulcorner T_n \urcorner$ is well-formed (in the sense of LF extended with morphism
variables).

Proof. To show the adequacy for the encoding of the inner syntax is straightforward. A similar proof 
was given in 0. 

The major lemmas for the outer syntax were already indicated in the text: 

• For an Isabelle type class instance $\tau :: C$ used in theory or locale $S$ and context $\Gamma$, we have
$G;\ \Gamma \rhd \ulcorner \tau :: C \urcorner : C \to S$ and $\ulcorner \tau :: C \urcorner(C.\mathit{this}) = \ulcorner \tau \urcorner$.

• There is an Isabelle sublocale relation $\mathit{loc}' < \mathit{loc}$ via instantiations $\sigma$ whenever the incomplete LF
morphism $\{\ulcorner \sigma \urcorner \ldots : \mathit{loc} \to \mathit{loc}'\}$ can be completed (by instantiating the axioms of $\mathit{loc}$ with proof
terms over $\mathit{loc}'$).

The main difficulty in the proofs is to show that at any point in the translated LF signatures exactly
the right atomic expressions are in scope. This has to be verified by a difficult and tedious comparison 
of the Isabelle documentation with the semantics of the LF module system. In particular, in our sim- 
plified grammar for Isabelle, we have omitted the features that would break this result. These include 
in particular the features whose translation requires inventing and keeping track of fresh names, such as 
overloading and unqualified locale instantiation. □ 

The above proof sketch is admittedly vague and not fully convincing. The problem is that a more elaborate proof
would require formal definitions of well-formedness for both module systems, and these are beyond the 
scope of this paper. (In fact, no comprehensive reference definition is available yet for the semantics of 
the modular syntax of either system.) 

5 Conclusion 

We have presented a representation of Isabelle's module system in the LF module system. Previous logic 
encodings in LF have only covered non-modular languages (e.g., [6, 8, 15]), and ours is the first encoding
of a modular logic. We also believe ours to be the first encoding of type classes or locale-like features in 
any logical framework. 

The details of the translation are quite difficult, and a full formalization requires intricate knowl- 
edge of both systems. However, guided by the use of signatures and signature morphisms as the main 
primitives in the LF module system, we could give a relatively intuitive account of Isabelle's structuring 
mechanisms. 

Our translation preserves modular structure; in particular the translation is compositional and the size 
of the output is linear in the size of the input. We are confident that our approach scales to other systems 
such as the type classes of Haskell or the functors of SML, and thus lets us study the modular properties 
of programming languages in logical frameworks. Moreover, we hold that the trade-off made in the LF 
module system between expressivity and simplicity makes it a promising starting point to investigate the 
movement of modular developments between systems. 

In order to formulate the representation, we had to add abstraction over morphisms to the LF module 
system. This effectively gives LF a restricted version of dependent record types. This is similar to the 
use of contexts as dependent records as, e.g., in ifTTl . Contrary to, e.g., and Ifl2l . the LF records may 
only occur in contravariant positions, which makes them a relatively simple conservative addition. 

An integration of this feature into the Twelf implementation of LF remains future work. Similarly, the 
use of anonymous morphisms has not been implemented in Twelf yet. In both cases, the implementation 
is conceptually straightforward. However, since it would permit the use of morphisms in terms, types, 
and kinds, it would require a closer integration of modular and core syntax in Twelf, which has so far 
been avoided deliberately. We will undertake the Twelf side of the implementation soon. 

In any case, Twelf will hardly be the bottleneck. Any translation from Isabelle
to LF would have to be implemented from within Isabelle because it requires Isabelle's reconstruction of
types and instantiations (let alone proof terms). However, Isabelle currently eliminates most aspects of 
modularity when checking a theory. For example, it is already difficult to export the local constants 
of a theory because the methods provided by Isabelle can only return all local, imported, or internally 
generated constants at once. The most promising albeit still very difficult approach seems to be to use a 
standalone parser for the Isabelle outer syntax and then fill in the gaps by calling the methods provided 
by Isabelle. Thus, even though this paper solves the logical questions of how to translate from Isabelle 
to LF, the corresponding software engineering questions are non-trivial and remain open. 